Identification of Character Adjectives from Mahabharata
Abstract
The present paper describes the identification of prominent characters and their adjectives from the Indian mythological epic, the Mahabharata, using its English text. In contrast to traditional approaches to named entity identification, the present system also extracts hidden attributes associated with each character (e.g., character adjectives). We observed distinct phrase-level linguistic patterns that hint at the presence of characters in different text spans. Six such patterns were used to extract the characters. In addition, a distinctive set of novel features (e.g., multi-word expressions, nodes and paths of the parse tree, immediate ancestors, etc.) was employed. The correlation among the features was also measured in order to identify the important ones. Finally, we applied various machine learning algorithms (e.g., Naive Bayes, KNN, Logistic Regression, Decision Tree, Random Forest, etc.) along with deep learning to classify the patterns as characters or non-characters with decent accuracy. Evaluation shows that the phrase-level linguistic patterns as well as the adopted features are highly effective in capturing characters and their adjectives.
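The pattern-based extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the six patterns, so the three regular expressions below are hypothetical stand-ins chosen only for illustration, and the function name `extract_character_adjectives` is likewise an assumption. The sketch covers candidate extraction only; it does not reproduce the parse-tree features or the classifiers mentioned above.

```python
import re
from collections import defaultdict

# Hypothetical surface patterns (NOT the paper's six patterns, which the
# abstract does not enumerate). A capitalized word stands in for a character
# name and a lowercase word for its adjective.
NAME = r"(?P<name>[A-Z][a-z]+)"
ADJ = r"(?P<adj>[a-z]+)"
PATTERNS = [
    re.compile(rf"\b[Tt]he {ADJ} {NAME}\b"),      # e.g. "the mighty Bhima"
    re.compile(rf"\b{NAME}, the {ADJ}\b"),        # e.g. "Arjuna, the brave"
    re.compile(rf"\b{NAME} (?:is|was) {ADJ}\b"),  # e.g. "Karna was generous"
]


def extract_character_adjectives(text):
    """Map each candidate character name to a sorted list of matched adjectives."""
    found = defaultdict(set)
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            found[match.group("name")].add(match.group("adj"))
    return {name: sorted(adjs) for name, adjs in found.items()}


sample = "The mighty Bhima fought beside Arjuna, the brave archer. Karna was generous."
result = extract_character_adjectives(sample)
```

Purely lexical patterns like these over-generate (for instance, any lowercase word after "was" is treated as an adjective), which is one reason a classification step over richer features, as the abstract describes, would be needed to separate characters from non-characters.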
Similar resources
Author gender identification from text using Bayesian Random Forest
Nowadays, the heavy use of virtual environments and the connections users form via social networks such as Facebook, Instagram, and Twitter make it more necessary than ever to identify shared subjects in these environments. Several applications benefit from reliable methods for inferring the age and gender of social media users. Such applications exist across a wide range of fields,...
A Computational Analysis of Mahabharata
Indian epics have not been analyzed computationally to the extent that Greek epics have. In this paper, we show how interesting insights can be derived from the ancient epic Mahabharata by applying a variety of analytical techniques based on a combination of natural language processing, sentiment/emotion analysis and social network analysis methods. One of our key findings is the pattern of sig...
Towards the Automatic Identification of Adjectival Scales: Clustering Adjectives According to Meaning
In this paper we present a method to group adjectives according to their meaning, as a first step towards the automatic identification of adjectival scales. We discuss the properties of adjectival scales and of groups of semantically related adjectives and how they imply sources of linguistic knowledge in text corpora. We describe how our system exploits this linguistic knowledge to compute a m...
The Effect of Contrastive Analysis on Iranian Intermediate EFL Learners of L2 Adjective Knowledge
The contrastive analysis hypothesis involves comparing the linguistic systems of two or more languages, and it holds that the main difficulties in learning a new language are caused by interference from the first language. The present study investigated the effect of contrastive analysis on Iranian intermediate EFL learners’ knowledge of L2 adjectives. The questi...
Automatic Extraction of Polar Adjectives for the Creation of Polarity Lexicons
Automatic creation of polarity lexicons is a crucial problem to solve in order to reduce the time and effort spent in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstra...